ABSTRACT



This paper is concerned with objectives-based
evaluation and alternative ways in which a system of objectives and 
test items might contribute to school programs. In the first study, 
teachers, parents, and students were involved in the needs assessment 
phase of educational evaluation with the use of behavioral 
objectives. All three were first asked to rate the importance of each 
objective for their school situation. Each group was then asked 
questions pertaining to these objectives. Among the results was a 
tendency for both parents and students to mis-predict pupil 
achievement. Teachers made relatively good predictions. The purpose 
of the second study was to compare the performance of learners taught 
by teachers trained or not trained by a three day PROBE institute in 
the use of behavioral objectives. A three day workshop was held for 
27 fourth grade social science teachers. To assess the effects of the 
workshop, a performance test was used where six objectives assembled 
for fourth grade social science were employed. No significant 
differences were found between students of the trained and untrained
teachers. (KJ) 
EXPERIMENTAL ASSESSMENT OF THE EFFECTS OF THE PROBE SYSTEM¹

Eva L. Baker 

Center for the Study of Evaluation 
University of California, Los Angeles 



It takes time, lack of reinforcement, and more time to eradicate 
oppositions of the heart. One such, now undergoing extinction by all but 
zealots, is that the act of providing objectives and coordinate test items to 
teachers in itself will modify the nature of educational practice. The
experience of the regional laboratories and our own anecdotal evidence urge
the production of a fairly elaborate support system if we wish any innovation,
in this case, objectives-based evaluation, to be effectively installed. The
research in implementation problems which shall be described represents a 
preliminary attempt to determine the requirements of such support and to 
explore alternative ways in which a system of objectives and test items 
might contribute to school programs.



STUDY ONE: COMMUNITY EVALUATION 



While the notion of evaluation almost always implies assessment following
some segment of instructional program, the utility of objectives-based
evaluation in a needs assessment function was explored in a predominately
black junior high school. The intent of the project was to involve parents,
teachers and students in identifying objectives of common and discrepant
interest and to aid in the school's planning of instructional programs to
facilitate achievement of target goals. The procedure actively sought
community input but the questions raised were those of values, i.e., what should
the goals of the schools be, rather than of means, e.g., how many minority
teachers should we have. Objectives from the most complete Collection in the
PROBE files were used. One advance limitation was that the subject matter of
the Collection was mathematics. But the study has functioned as a procedural
prototype for future investigations.


Overview 

Teachers, parents and students were involved in the needs assessment phase



Administration of the studies and data analyses were supervised by Ted Dahl 
of the staff of the Center for the Study of Evaluation. 

The research and development reported herein was performed at the Center for
the Study of Evaluation, UCLA, pursuant to a contract with the U.S. Office of 
Education, Department of Health, Education and Welfare, under the provisions 
of the Cooperative Research Program. 
of educational evaluation with the use of behavioral objectives. All three
groups were first asked to rate the importance of each objective for their 
school situation. Teachers also indicated if objectives were among those 
they ordinarily taught and to estimate their classes' level of performance 
on the objective. Parents were asked to indicate whether they felt their 
child could currently master each objective. Learners were asked to predict 
their own performance on the objectives. To aid in respondents' understanding, 
each objective was clarified by an example of a test item which would measure 
it. Students were then tested on the objectives to determine their actual 
performance. 

Procedure 



Teacher Data. Ten teachers, each instructing two classes of seventh
grade mathematics, were provided with 43 objectives and sample test items
taken from the PROBE collection for grades 6, 7, and 8. Teachers rated
objectives on a five point scale (1 = low, 5 = high) in terms of importance,
estimated the percentage of pupils who could achieve the objective and 
indicated if the objective was normally taught in their classes. Of the 
43 objectives presented to the teachers, 15 had been previously identified 
for use in the study (See Figure 1). These dealt with important arithmetic 
operations, scientific notation, measurement and geometry. 

Learner Data. In these teachers' classes a total of 634 students were
asked to complete a questionnaire containing the target 15 objectives and 
sample items. Students were to rate each objective in terms of importance 
and to indicate whether they felt they could solve problems like the one 
presented in the sample test item. Following administration
of the questionnaire, learners were tested on the 15 objectives which they 
had rated. In order to limit the time necessary for testing, five separate 
test forms were devised. On each form three of the objectives were intensively 
sampled with eight items each, while the other 12 objectives were measured 
with three items each, resulting in a 60 item test. 
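The form design described above can be checked with a short sketch. The paper does not say which objectives were intensively sampled on which form, so the assignment of consecutive blocks of three objectives to each form is an assumption made only for illustration, as are all the names in the code.

```python
# Sketch of the five-form test design: 15 objectives, each form samples
# 3 objectives intensively (8 items) and the other 12 lightly (3 items).
# ASSUMPTION: the paper does not report which objectives were intensive on
# which form; consecutive blocks of three are used here for illustration.

N_OBJECTIVES = 15
N_FORMS = 5
INTENSIVE_ITEMS = 8
LIGHT_ITEMS = 3


def build_forms():
    """Return one dict per form mapping objective number -> item count."""
    per_form = N_OBJECTIVES // N_FORMS  # 3 intensive objectives per form
    forms = []
    for f in range(N_FORMS):
        intensive = set(range(f * per_form + 1, (f + 1) * per_form + 1))
        form = {
            obj: INTENSIVE_ITEMS if obj in intensive else LIGHT_ITEMS
            for obj in range(1, N_OBJECTIVES + 1)
        }
        forms.append(form)
    return forms


forms = build_forms()
for number, form in enumerate(forms, start=1):
    print(f"Form {number}: {sum(form.values())} items")
```

Under this assumed layout every objective is intensively sampled on exactly one form, and each form totals 3 x 8 + 12 x 3 = 60 items, which agrees with the test length reported.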

Parent Data. Questionnaires were mailed to 164 parents of students. All
parents in three complete classrooms were sent letters and four parents from
each of the other 17 classrooms were sampled at random. Only 123 of the
letters were received, as there were 41 letters with inaccurate addresses.
Parents were asked to respond to the same 15 objectives and test items,
estimating if their child could achieve the objective, and rating importance
from 1 to 5. Parents were also asked to indicate if they felt community
participation of this type was useful and whether they would be willing to 
participate in another survey. Parents who did not respond within three 
weeks of the first mailing received a second letter, and following four 
weeks a third letter was sent. 

Results

The results of this study are mixed, both in terms of utility and 
valence. Eighty-two parents, or 67 per cent of the correctly addressed
letters, responded to the questionnaire: 29 parents after the first mailing,
36 after the second and 17 after the third letter was sent.
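The response figures can be reproduced from the counts given in the text. The following is only an arithmetic check; the variable names are illustrative and the cumulative rates by mailing are derived here rather than reported in the paper.

```python
# Arithmetic check of the parent response rate, using counts from the text.
mailed = 164
undeliverable = 41
delivered = mailed - undeliverable  # letters with accurate addresses

responses_by_mailing = [29, 36, 17]  # first, second, third letter
total_responses = sum(responses_by_mailing)
response_rate = total_responses / delivered

# Cumulative response rate after each mailing (derived, not reported).
cumulative_rates = []
running = 0
for count in responses_by_mailing:
    running += count
    cumulative_rates.append(running / delivered)

print(f"Delivered: {delivered}, responses: {total_responses}")
print(f"Overall response rate: {response_rate:.1%}")
for mailing, rate in enumerate(cumulative_rates, start=1):
    print(f"After mailing {mailing}: {rate:.1%}")
```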
Analysis of variance was conducted on the parents' responses to the
objectives according to the time of response to the questionnaire (after the
first letter, the second letter, or the third letter). Significant differences
were found for responses to only two of the 15 objectives, which were rated
considerably lower by parents who responded to the third request.

Parental ratings of the 15 objectives were uniformly high, with a mean
across all 15 objectives of 4.3. Students were less sanguine about the
objectives, with a mean rating of 3.9. Teachers' ratings averaged 3.9.

An analysis of variance of the three groups' ratings for each objective was
conducted. Significant differences were found for 11 of the 15 objectives
(Numbers 1 and 5-14) and for the average rating across all objectives.
Differences in five of those instances were attributable to the high
rating which the parents reported.

(Insert Table 1 about here.)



The meanings of these ratings may be related to the specific nature
of each objective considered. Objectives most favorably rated by teachers
dealt with set equivalences, whole number arithmetic operations, and translations
of fractions when shown a pictorial representation. One interpretation
of these results is that the teachers prefer practical and basic concepts in
arithmetic, but a less optimistic colleague suggested that the teachers might
have preferred what they felt was easiest to teach. Parents generally rated
objectives in arithmetic operations highest but also favored a word problem
task. Students rated objectives of a somewhat esoteric nature more important.
Possibly because they were unfamiliar with the content involved, students
favored objectives such as the writing of numerals in scientific and expanded
notation, finding the circumference of a circle and finding the area of
common geometrical figures. Correlations among the ratings are presented
in Table 2, where a tendency for parents and students to agree with each other
and disagree with the teachers can be observed.

(Insert Table 2 about here.)

Comparisons of the abilities of teachers, parents and students to predict
performance with achievement levels produced findings of considerable interest.
(See Table 3.) Both students and parents underestimate the competencies of
the students, with parents predicting about 24 per cent performance and students
predicting 22 per cent performance. Actual mean learner achievement
on all objectives was 43 per cent. Teachers, on the other hand,
were more optimistic about the ability levels of the students, and predicted
their pupils' achievement at around 54 per cent. Such over-prediction might
deflate the argument that teachers in predominately black schools tend to
under-predict their students' ability.
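The size and direction of each group's prediction error follow directly from the rounded percentages quoted above. The sketch below uses only those four figures; the dictionary keys and variable names are illustrative.

```python
# Prediction error by group, using the rounded percentages quoted in the text.
predicted = {"parents": 24, "students": 22, "teachers": 54}
actual = 43  # mean learner achievement across all objectives

# Signed error: negative means under-prediction, positive means over-prediction.
errors = {group: value - actual for group, value in predicted.items()}

# Group whose mean prediction lies closest to actual achievement.
closest = min(errors, key=lambda group: abs(errors[group]))

for group, error in errors.items():
    direction = "over" if error > 0 else "under"
    print(f"{group}: {direction}-predicts by {abs(error)} percentage points")
print(f"Closest group: {closest}")
```

On these means the teachers' estimate is the nearest to actual achievement, although it errs in the opposite direction from the estimates of parents and students.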

Separate correlation coefficients were computed between (1) each of the
three groups' predictions of student achievement and (2) actual student
achievement. The mean correlations for each of the 15 objectives were
calculated and are presented in Table 4.
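As an illustration of the kind of coefficient involved, the sketch below computes a product-moment correlation between predicted and actual achievement. The paper does not give the raw data or the exact unit of analysis, so every number here is hypothetical and the example shows only how a negative coefficient arises when predictions run opposite to achievement.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL per-objective percentages, invented for illustration only.
actual = [60, 35, 20, 45, 70]
teacher_pred = [65, 45, 30, 50, 80]
parent_pred = [15, 30, 40, 25, 10]

print(f"teachers vs. actual: r = {pearson_r(teacher_pred, actual):.2f}")
print(f"parents vs. actual:  r = {pearson_r(parent_pred, actual):.2f}")
```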



Ability to convert decimals written in expanded notation (F = 8.09); ability
to solve word problems dealing with multiplication (F = 5.76).
(Insert Table 4 about here.)

The relatively high negative correlations with achievement reflect a 
tendency for both parents and students to mis-predict pupil achievement. The 
predictions of parents and students show some consistency. Looking at these 
correlations conjointly with the means of predictions and achievement of the 
objectives, one might infer that both students and parents are consistently 
underpredicting their performance. Teachers, on the other hand, make
relatively good predictions of their classes' performance levels and disagree
with both student and parent estimates of performance. Parents who responded
were positive about this type of involvement in school operations. Eighty-four per cent
indicated that they thought the project was a good idea and 83 per cent expressed
willingness to respond to another questionnaire.

Implications

Results of this investigation have prompted the school to seek specific
help in areas of deficiency in student performance. Seven of the 15 objectives
received an average rating of four or above by at least two of the groups.
Of these, four objectives were the lowest in terms of student achievement.

While it is obvious that only limited substantive applications can be
made from objectives in the field of mathematics, where the nature of the
subject matter limits curriculum decisions, both the school and staff of the
Center were encouraged, not only by the willingness of the parents, teachers
and pupils to participate, but more generally by the potential utility of the
procedure. Replications in American History and Black Studies are now
underway.

An incidental, but possibly important, result is that the personnel of the
school itself, at first reserved about the consequences of specific research
projects on their daily operations, have reported positive acceptance of this
procedure. Teachers were particularly cooperative and did not feel that the 
research investigation was an artificial interruption in the normal activities 
of the school. 



STUDY TWO: EVALUATION OF TRAINED TEACHERS' EFFECTS ON THE
BEHAVIOR CHANGES THEY PRODUCE IN THEIR LEARNERS



A central issue in the development of the PROBE system of objectives and 
items is the definition of the support requirements necessary to get the 
procedure into widespread use. The assumption has always been made in 
PROBE that some training experience for teachers would be provided so they 
could begin to understand and capitalize on the use of objectives and test 
items to improve their evaluations. Such an argument assumes that the 
purpose of the system is ultimately to improve the effects of educational 
practice rather than to function primarily to describe the status of educa- 
tional programs. 
The purpose of a second study was to compare the performance of learners
taught by teachers trained or not trained by a three day PROBE institute in
the use of behavioral objectives. Performance tests as dependent measures
have been employed before,³ and although the idea that a short training situation
might immediately produce teacher behavior changes strong enough to affect
pupil achievement was wildly hopeful, we decided to verify the immediate
consequences of such training. In addition, we gathered information regarding
a number of other procedures relevant to PROBE, including teachers' responses
to the items, the objectives and their use of pretest data.

Subjects



Six school districts within easy testing distance of the Center for the 
Study of Evaluation were contacted weeks prior to the institute and asked if 
they would submit the names of at least 10 volunteer fourth grade social 
science teachers to participate in a three day training session on the use of 
behavioral objectives. Fifty-four teachers volunteered for the session and, 
blocking by district, twenty-seven were randomly assigned to participate in 
the training. 
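Random assignment blocked by district can be sketched as follows. The paper reports only the totals (54 volunteers, 27 assigned to training), so the roster, the district sizes and the function name below are hypothetical; even-sized districts are assumed so that each block splits exactly in half.

```python
import random


def assign_within_blocks(teachers_by_district, seed=0):
    """Randomly split each district's volunteers into trained and control.

    Shuffling is done separately inside every district (the block), so both
    conditions contain the same number of teachers from each district when
    the district size is even. With an odd size the extra teacher goes to
    the control group.
    """
    rng = random.Random(seed)
    trained, control = [], []
    for district in sorted(teachers_by_district):
        teachers = list(teachers_by_district[district])
        rng.shuffle(teachers)
        half = len(teachers) // 2
        trained.extend(teachers[:half])
        control.extend(teachers[half:])
    return trained, control


# HYPOTHETICAL roster: six districts totalling 54 volunteers.
district_sizes = {"A": 10, "B": 10, "C": 10, "D": 8, "E": 8, "F": 8}
roster = {
    name: [f"{name}-{i}" for i in range(1, size + 1)]
    for name, size in district_sizes.items()
}

trained, control = assign_within_blocks(roster, seed=1969)
print(len(trained), "trained;", len(control), "control")
```

Blocking in this way keeps district-level differences, such as curriculum or pupil population, from being confounded with the training condition.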

Treatment 



While a proper training program would focus on instructional methodology,
e.g., the uses of iterative testing procedures, and would optimally provide
practice for the teachers in these behaviors in a classroom context, the
PROBE staff adopted a "lean programming"² strategy rather than attempt to
develop a total teacher education program. Limitations of district in-service
training resources (for example, paying substitute teachers while regular
teachers are undergoing training) encouraged the PROBE training institute
to focus on the fewest objectives which we thought could possibly do the job.

A three day workshop was conducted at UCLA during the first week in
October, 1969. The workshop was planned to total 18 hours, but the amount of
instructional time actually spent was less than 12 hours. An additional
purpose for the workshop was the hope that it would represent a first
generation attempt at an instructional training package for eventual exportation
to either a network of cooperating schools for dissemination purposes
or directly to user districts.

The objectives of the institute called for participants, at its conclusion, 
to be able to: 

1. discriminate between statements of behavioral objectives 

2. write behavioral objectives 

3. write possible entry and en route behaviors for given instructional 
objectives 

4. discriminate between examples of relevant and irrelevant practice 
for given objectives 

5. produce instances of relevant practice for stated objectives 

6. generate additional items for objectives when presented with a sample
item

7. prepare lessons which exhibited the following components:
a. task analysis of objective

b. relevant practice

c. iterative testing and remediation cycles.



2

Markle, Susan M., Good Frames and Bad, Second Edition. John Wiley and Sons,
New York, 1969.

3

Popham, W. James and Baker, Eva L., "Validation Results: A Performance Test
of Teaching Proficiency." Paper presented at the annual meeting of the American
Educational Research Association, Chicago, Illinois, February 7-10, 1968.


All participants were given a pretest in which their ability to perform
the objectives was assessed. Following seven hours of instruction, they
received a criterion check to monitor their progress toward the objectives,
and at the completion of the institute a posttest was given. An 80 per cent
criterion level was set to indicate mastery of the workshop's objectives.
Our training was not terribly effective, since only 50 per cent of the
teachers reached this desired criterion level.

Criterion Measure 

To assess the effects of the workshop, a performance test was used
where six objectives assembled for fourth grade social science were employed.
The objectives focused on the translation and interpretation of graphed
data, a task not as yet treated in the participating districts' programs.

Before the instruction began, over 1,600 children in all 54 classrooms
were pretested on the six objectives and means of their class for each of
the six objectives were reported. All teachers received these data, the objectives,
and sample items ten days prior to the scheduled instructional period.
Following a seven day instructional period, where teachers devoted approximately
30 minutes a day to this topic, children were given a 12 item posttest by the
Center staff. Test items measuring these geography objectives had been tried
out previously on seven fourth grade classes in another community, critiqued
by the seven teachers, revised, readministered to six other fourth grade
classes, again critiqued and revised, prior to the administration of the
pretest in the actual study. Teacher feedback in all cases was directed
to the cohesiveness of the unit, item difficulty, reading level, and the
extent to which the items were perceived as adequate measures of the objectives.
During the posttest, teachers were asked to complete a questionnaire where
they rated the utility of the unit, format of the objectives and items, and
described the nature of the learning activities which they used.

Analysis and Results

Analysis of covariance was computed for the posttests of fourth grade
students, using pretest scores as a covariate. Total posttest means of 54
classrooms were the entries in the analysis, corresponding to the number of
teachers involved in the study. No significant differences were obtained.
Looking at the experimental group of teachers only, fourteen of twenty-seven
reached the 80 per cent criterion level on the training posttest. For this
group the "treatment" as it was conceptualized should show its greatest
effects. Analyses of covariance, comparing successful and unsuccessful
teachers on the institute posttest (N1 = 14, N2 = 13), and successful and control
teachers (N1 = 14, N2 = 27), yielded no significant difference on the total
student posttest means. When analysis of covariance was conducted for each
of the six objectives, a significant difference was found for objective two.
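For readers unfamiliar with the procedure, the sketch below shows a textbook one-way analysis of covariance with a single covariate, applied to class means as the unit of analysis. It is not a reconstruction of the computations actually run for this study; the class means are hypothetical and the function name is illustrative.

```python
def ancova_f(groups):
    """One-way ANCOVA with a single covariate.

    groups: list of groups, each a list of (pretest, posttest) pairs.
    Returns (F, df_between, df_error) for the covariate-adjusted group effect.
    """
    all_pairs = [pair for group in groups for pair in group]
    n_total = len(all_pairs)
    k = len(groups)

    def sums_of_squares(pairs):
        # Sums of squares and cross-products about the means of these pairs.
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
        syy = sum((y - mean_y) ** 2 for _, y in pairs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        return sxx, syy, sxy

    t_xx, t_yy, t_xy = sums_of_squares(all_pairs)
    w_xx = w_yy = w_xy = 0.0
    for group in groups:
        sxx, syy, sxy = sums_of_squares(group)
        w_xx += sxx
        w_yy += syy
        w_xy += sxy

    # Posttest variation left after removing the linear effect of the pretest.
    adj_total = t_yy - t_xy ** 2 / t_xx
    adj_within = w_yy - w_xy ** 2 / w_xx
    adj_between = adj_total - adj_within

    df_between = k - 1
    df_error = n_total - k - 1  # one extra df is spent on the covariate
    f_ratio = (adj_between / df_between) / (adj_within / df_error)
    return f_ratio, df_between, df_error


# HYPOTHETICAL class means (pretest proportion, posttest proportion).
trained = [(0.31, 0.50), (0.28, 0.47), (0.35, 0.58), (0.40, 0.55)]
untrained = [(0.30, 0.49), (0.33, 0.53), (0.27, 0.44), (0.38, 0.57)]

f_ratio, df1, df2 = ancova_f([trained, untrained])
print(f"F({df1}, {df2}) = {f_ratio:.2f}")
```

With the 54 classroom means of the actual study the error term would carry 54 - 2 - 1 = 51 degrees of freedom for the two-group comparison.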



Considering the number of analyses computed, such a finding is best explained
as a random event. While performance between treatment groups was remarkably
consistent, there were disparate performance levels on each of the six objectives
considered. Performance levels of the learners on each of the objectives are
presented in Table 5.


(Insert Table 5 about here.) 

Performance for both treatment groups was considerably higher on objectives 
one and two, objectives which measure recall skill rather than any translation 
or application of information. For the other objectives, class performance 
was relatively poor, although there was improvement displayed on each objective. 
Teachers in the trained treatment reported that they spent approximately 185
minutes on instruction, while the comparison group spent about 200 minutes.
The trained teachers reported that an average of 85 additional minutes would
be necessary to have students reach a satisfactory criterion level, while the
untrained teachers estimated that 400 additional minutes, or almost twice again
the instructional time originally allocated for the unit, would be necessary.
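The phrase "almost twice again the instructional time originally allocated" can be checked against the figures in the text, taking the allocation to be seven days at approximately 30 minutes a day. The variable names below are illustrative.

```python
# Instructional time check, using figures quoted in the text (minutes).
allocated = 7 * 30  # seven days at about 30 minutes a day

reported = {"trained": 185, "untrained": 200}
additional = {"trained": 85, "untrained": 400}

# Total time each group believed the unit would require.
total_needed = {group: reported[group] + additional[group] for group in reported}

# Untrained teachers' extra time as a multiple of the original allocation.
untrained_ratio = additional["untrained"] / allocated

print(total_needed)
print(f"Untrained extra time = {untrained_ratio:.2f} x allocated time")
```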

Inspecting the process data, that is, the number of activities described
in the questionnaire, two research assistants independently judged the
activities in terms of relevance to the objectives. Teachers in the trained
group produced approximately two and one half times as many relevant activities,
but because of the immense variation within groups, this difference was not
significant. When teachers' responses to the questionnaire were inspected,
a correlation of .36 (n = 54, p < .01) was found between whether the teacher
considered the materials useful and total student achievement. Differences in 
attitude toward use of materials based on treatment condition were not found 
to be significant. 

Discussion



There are a number of plausible interpretations of the "negative" findings
on pupil achievement. Optimistically, one might contend that the time allocated
in the criterion task, seven days of instruction, was not sufficient for a
teacher to institute teaching, testing, and reteaching cycles and thus "improve"
his instruction. Another explanation might clearly indict the impotence of
the original training time. Twelve hours of instruction might be insufficient
to modify substantially pedagogical habits produced by years of teaching.
Yet, we did have at least half of the group attain the mastery level we had
hoped for in the training and no differences in their pupils' achievement were
found. For a moment, it looked as if we had affected their instructional
activities in the reports but that was a mirage. Had we conducted the training
in a way purely consistent with the approach we advocated, we would have provided
reinforced classroom practice in applying the verbal behaviors the teachers
were learning. But such a procedure was precluded by the decision to integrate
this training program unobtrusively into usual district practice.

To explain the lack of differences found on the teacher attitude measures, 
we might examine the teachers who were involved. The districts did not mandate 
attendance, and thus we assigned both training and control treatments to 
volunteers. Those teachers were thus already somewhat positive toward
behavioral objectives, and lack of differences in attitude data might not be
difficult to explain. Both groups averaged 2.5 on a three point scale for
a question asking if teachers would be willing to use objectives and items
in other subject matter areas. 

The last alternative is, of course, that we don't know what the
critical components of such a training program really are and that our
instruction was wholly inadequate, not just in the lack of practice opportunities
for the teachers, but in concept. The results of this study, particularly
with regard to the teachers who mastered our objectives, indicate that short-term
installation programs for disseminating new practices might be viewed more
skeptically. Certainly, PROBE can't get along with a packaged training
institute alone.

The studies described were both directed toward the practical problems
of helping schools to make use of the resources which PROBE offers. The
community-based evaluation study investigated the use of specific statements
of goals as a means to involve parents and students in program decision-making.
The second study tried to determine if decisions made by teachers
in the use of PROBE were enhanced by a workshop experience. Research of a
practical nature will necessarily be continued by the PROBE staff, since our
concern is directed to those procedures which can ultimately make a change in
the effect of the schools.




Figure 1. Mathematics Objectives 



To add, subtract, multiply or divide measures.

Given a set of numbers, the student will compute the average. 

To rename a decimal numeral using scientific notation. 

Given a numeral written in expanded notation, the student will name the 
decimal numeral for the indicated sum. 

To identify equal and equivalent sets. 

To add, subtract or multiply decimals.

To find the greatest common factor of a set of numbers. 

Given a decimal numeral, the student will round it off. 

To multiply whole numbers. 

To find area of a rectangle, square, triangle, parallelogram, or rhombus.

Given a word problem involving two place multiplication, the student
will solve the problem.

To add and subtract whole numbers. 

To write the fraction shown when given a picture. 

To subtract unlike fractions and reduce the answer to its simplest form.

To find the circumference of a circle.



Table 1. Means and Standard Deviations of Ratings of Mathematics Objectives 
Table 3. Means and Standard Deviations of Parents', Teachers' and Students' Predictions and Learner Achievement
Table 5. Pre- and Posttest Proportions for Classes of Trained and Untrained Teachers for Six Objectives